The multi-resolution extended edit distance
Abstract
Similarity search is a fundamental problem in information technology. The main difficulty of this problem is the high dimensionality of the data objects. In large time series databases, it is important to reduce the dimensionality of these data objects so that they can be managed. Symbolic representation is a promising technique for dimensionality reduction. In this paper we propose a new distance metric, which is applied to symbolic sequential data objects, and we test it on time series databases in classification task experiments. We also compare it to other distances that are well known in the literature for symbolic data objects, and we prove that it is a metric.
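The abstract does not define the proposed distance, so the following is only a minimal sketch of the general setting it describes: a time series is reduced to a short symbolic string, and two such strings are compared with the classical edit distance. The function names `to_symbols` and `edit_distance`, the segment-mean reduction, and the equal-width binning are illustrative assumptions, not the paper's method.

```python
from typing import Sequence


def to_symbols(series: Sequence[float], segments: int, alphabet: str = "abcd") -> str:
    """Hypothetical reduction: one symbol per segment (requires segments <= len(series)).

    Each segment is replaced by its mean, and the means are mapped to
    letters using equal-width bins between the smallest and largest mean.
    """
    n = len(series)
    means = []
    for i in range(segments):
        start, stop = i * n // segments, (i + 1) * n // segments
        means.append(sum(series[start:stop]) / (stop - start))
    lo, hi = min(means), max(means)
    # Fall back to width 1.0 when all means are equal (flat series).
    width = (hi - lo) / len(alphabet) or 1.0
    return "".join(
        alphabet[min(int((m - lo) / width), len(alphabet) - 1)] for m in means
    )


def edit_distance(s: str, t: str) -> int:
    """Classical edit distance (insert, delete, substitute, unit costs)."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # delete cs
                            curr[j - 1] + 1,      # insert ct
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    a = to_symbols([0, 0, 1, 1, 2, 2, 3, 3], segments=4)
    b = to_symbols([3, 3, 2, 2, 1, 1, 0, 0], segments=4)
    print(a, b, edit_distance(a, b))
```

The reduced strings are much shorter than the raw series, which is what makes a quadratic-time string distance practical on large time series databases.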
Similar resources
Parameter-Free Extended Edit Distance
The edit distance is the most famous distance to compute the similarity between two strings of characters. The main drawback of the edit distance is that it is based on local procedures which reflect only a local view of similarity. To remedy this problem we presented in a previous work the extended edit distance, which adds a global view of similarity between two strings. However, the extended...
Mining Transliterations from Wikipedia using Dynamic Bayesian Networks
Transliteration mining is aimed at building high quality multi-lingual named entity (NE) lexicons for improving performance in various Natural Language Processing (NLP) tasks including Machine Translation (MT) and Cross Language Information Retrieval (CLIR). In this paper, we apply two Dynamic Bayesian network (DBN)-based edit distance (ED) approaches in mining transliteration pairs from Wikipe...
Efficient Algorithms for Approximate String Matching with Swaps (Extended Abstract)
Most research on the edit distance problem and the k-differences problem considered the set of edit operations consisting of changes, insertions, and deletions. In this paper we include the swap operation that interchanges two adjacent characters into the set of allowable edit operations, and we present an O(t min(m, n))-time algorithm for the extended edit distance problem, where t is the edit...
Map Edit Distance vs. Graph Edit Distance for Matching Images
Generalized maps are widely used to model the topology of nD objects (such as 2D or 3D images) by means of incidence and adjacency relationships between cells (0D vertices, 1D edges, 2D faces, 3D volumes, ...). We have introduced in [1] a map edit distance. This distance compares maps by means of a minimum cost sequence of edit operations that should be performed to transform a map into another...
FURY: Fuzzy Unification and Resolution Based on Edit Distance
We present a theoretically founded framework for fuzzy unification and resolution based on edit distance over trees. Our framework extends classical unification and resolution conservatively. We prove important properties of the framework and develop the FURY system, which implements the framework efficiently using dynamic programming. We evaluate the framework and system on a large problem in ...
Publication date: 2008